Introduction

Data Description

## 'data.frame':    2240 obs. of  16 variables:
##  $ Year_Birth         : int  1970 1961 1958 1967 1989 1958 1954 1967 1954 1954 ...
##  $ Education          : chr  "Graduation" "Graduation" "Graduation" "Graduation" ...
##  $ Marital_Status     : chr  "Divorced" "Single" "Married" "Together" ...
##  $ Income             : num  84835 57091 67267 32474 21474 ...
##  $ Kidhome            : int  0 0 0 1 1 0 0 0 0 0 ...
##  $ Teenhome           : int  0 0 1 1 0 0 0 1 1 1 ...
##  $ Dt_Customer        : chr  "6/16/14" "6/15/14" "5/13/14" "5/11/14" ...
##  $ MntWines           : int  189 464 134 10 6 336 769 78 384 384 ...
##  $ MntFruits          : int  104 5 11 0 16 130 80 0 0 0 ...
##  $ MntMeatProducts    : int  379 64 59 1 24 411 252 11 102 102 ...
##  $ MntFishProducts    : int  111 7 15 0 11 240 15 0 21 21 ...
##  $ MntSweetProducts   : int  189 0 2 0 0 32 34 0 32 32 ...
##  $ MntGoldProds       : int  218 37 30 0 34 43 65 7 5 5 ...
##  $ NumWebPurchases    : int  4 7 3 1 3 4 10 2 6 6 ...
##  $ NumCatalogPurchases: int  4 3 2 0 1 7 10 1 2 2 ...
##  $ NumStorePurchases  : int  6 7 5 2 2 5 7 3 9 9 ...
##    Year_Birth    Education         Marital_Status         Income      
##  Min.   :1893   Length:2240        Length:2240        Min.   :  1730  
##  1st Qu.:1959   Class :character   Class :character   1st Qu.: 35303  
##  Median :1970   Mode  :character   Mode  :character   Median : 51382  
##  Mean   :1969                                         Mean   : 52247  
##  3rd Qu.:1977                                         3rd Qu.: 68522  
##  Max.   :1996                                         Max.   :666666  
##                                                       NA's   :24      
##     Kidhome          Teenhome      Dt_Customer           MntWines      
##  Min.   :0.0000   Min.   :0.0000   Length:2240        Min.   :   0.00  
##  1st Qu.:0.0000   1st Qu.:0.0000   Class :character   1st Qu.:  23.75  
##  Median :0.0000   Median :0.0000   Mode  :character   Median : 173.50  
##  Mean   :0.4442   Mean   :0.5062                      Mean   : 303.94  
##  3rd Qu.:1.0000   3rd Qu.:1.0000                      3rd Qu.: 504.25  
##  Max.   :2.0000   Max.   :2.0000                      Max.   :1493.00  
##                                                                        
##    MntFruits     MntMeatProducts MntFishProducts  MntSweetProducts
##  Min.   :  0.0   Min.   :   0    Min.   :  0.00   Min.   :  0.00  
##  1st Qu.:  1.0   1st Qu.:  16    1st Qu.:  3.00   1st Qu.:  1.00  
##  Median :  8.0   Median :  67    Median : 12.00   Median :  8.00  
##  Mean   : 26.3   Mean   : 167    Mean   : 37.53   Mean   : 27.06  
##  3rd Qu.: 33.0   3rd Qu.: 232    3rd Qu.: 50.00   3rd Qu.: 33.00  
##  Max.   :199.0   Max.   :1725    Max.   :259.00   Max.   :263.00  
##                                                                   
##   MntGoldProds    NumWebPurchases  NumCatalogPurchases NumStorePurchases
##  Min.   :  0.00   Min.   : 0.000   Min.   : 0.000      Min.   : 0.00    
##  1st Qu.:  9.00   1st Qu.: 2.000   1st Qu.: 0.000      1st Qu.: 3.00    
##  Median : 24.00   Median : 4.000   Median : 2.000      Median : 5.00    
##  Mean   : 44.02   Mean   : 4.085   Mean   : 2.662      Mean   : 5.79    
##  3rd Qu.: 56.00   3rd Qu.: 6.000   3rd Qu.: 4.000      3rd Qu.: 8.00    
##  Max.   :362.00   Max.   :27.000   Max.   :28.000      Max.   :13.00    
## 
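The summary above shows 24 missing `Income` values and a few implausible extremes (a birth year of 1893, an income of 666666). The following is a minimal sketch of one common way to handle them. It assumes a data frame called `customers` (a hypothetical name, filled here with toy values) and a 1.5 × IQR fence; the source does not say which rule was actually applied.

```r
# Toy stand-in for the real data; column names follow the str() output above.
customers <- data.frame(
  Year_Birth = c(1970, 1961, 1893, 1967, 1989, 1958, 1954, 1975),
  Income     = c(84835, 57091, 67267, NA, 21474, 666666, 51382, 35303)
)

# Flag values outside the k * IQR fences.
# Assumed rule for illustration, not confirmed by the source.
outside_fence <- function(x, k = 1.5) {
  q <- quantile(x, c(0.25, 0.75), na.rm = TRUE)
  fence <- k * (q[2] - q[1])
  x < q[1] - fence | x > q[2] + fence
}

# Count missing values per column.
na_counts <- colSums(is.na(customers))

# Keep rows that are complete and inside the fences on both columns.
keep <- complete.cases(customers) &
  !outside_fence(customers$Income) &
  !outside_fence(customers$Year_Birth)
cleaned <- customers[keep, ]

summary(cleaned)
```

Dropping rows is the simplest option; imputing `Income` would keep more observations at the cost of an extra modelling assumption.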

Personal variables

Product Variables

Principal Component Analysis

##     MntWines        MntFruits      MntMeatProducts MntFishProducts 
##  Min.   :   0.0   Min.   :  0.00   Min.   :  0.0   Min.   :  0.00  
##  1st Qu.:  22.0   1st Qu.:  1.00   1st Qu.: 14.0   1st Qu.:  2.00  
##  Median : 154.0   Median :  7.00   Median : 57.0   Median : 11.00  
##  Mean   : 277.9   Mean   : 22.73   Mean   :142.6   Mean   : 32.47  
##  3rd Qu.: 462.0   3rd Qu.: 28.00   3rd Qu.:189.0   3rd Qu.: 39.00  
##  Max.   :1285.0   Max.   :172.00   Max.   :913.0   Max.   :225.00  
##  MntSweetProducts  MntGoldProds   
##  Min.   :  0.00   Min.   :  0.00  
##  1st Qu.:  1.00   1st Qu.:  8.00  
##  Median :  7.00   Median : 22.00  
##  Mean   : 23.41   Mean   : 39.75  
##  3rd Qu.: 29.00   3rd Qu.: 52.00  
##  Max.   :176.00   Max.   :224.00

##                   Comp1  Comp2  Comp3 communality
## MntWines         -0.706 -0.486  0.439    0.927353
## MntFruits        -0.800  0.281 -0.054    0.721877
## MntMeatProducts  -0.844 -0.043  0.287    0.796554
## MntFishProducts  -0.819  0.264 -0.114    0.753453
## MntSweetProducts -0.795  0.333 -0.064    0.747010
## MntGoldProds     -0.638 -0.511 -0.571    0.994206

  • Component 1: index of how much the customer dislikes purchasing in general
  • Component 2: index of how much the customer dislikes wine
  • Component 3: index of how much the customer dislikes gold products

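A loadings table with a communality column like the one above can be reproduced in several ways. The sketch below uses base R's `prcomp` on synthetic spending data, which is an assumption: the source does not state which function produced its table, and the numbers generated here will not match it.

```r
set.seed(1)

# Synthetic spending data standing in for the six Mnt* columns.
n <- 200
base <- rexp(n, rate = 1 / 50)  # shared "overall spending" factor
spend <- data.frame(
  MntWines         = 5 * base + rexp(n, 1 / 100),
  MntFruits        = 0.5 * base + rexp(n, 1 / 10),
  MntMeatProducts  = 3 * base + rexp(n, 1 / 50),
  MntFishProducts  = 0.7 * base + rexp(n, 1 / 15),
  MntSweetProducts = 0.5 * base + rexp(n, 1 / 10),
  MntGoldProds     = 0.8 * base + rexp(n, 1 / 30)
)

# PCA on standardized variables (equivalent to using the correlation matrix).
pca <- prcomp(spend, center = TRUE, scale. = TRUE)

# Loadings: eigenvectors scaled by the component standard deviations.
# With standardized inputs they equal variable-component correlations.
k <- 3
loadings <- pca$rotation[, 1:k] %*% diag(pca$sdev[1:k])
colnames(loadings) <- paste0("Comp", 1:k)

# Communality: share of each variable's variance explained by the k components.
communality <- rowSums(loadings^2)

round(cbind(loadings, communality), 3)
```

The sign of each component is arbitrary, so a column of all-negative loadings carries the same information as an all-positive one. That is why the first component reads as an index of disliking purchases rather than liking them.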
Clusters as new category values

  • Cluster 1: active shopper
  • Cluster 2: wine lover
  • Cluster 3: gold lover
  • Cluster 4: inactive shopper
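The source does not say how the four product clusters were obtained from the component scores. As one possibility, the sketch below runs k-means on three synthetic score columns and stores the result as a factor, so it can be used as a categorical variable later. The choice of `kmeans`, the object names, and the data are all assumptions for illustration.

```r
set.seed(42)

# Synthetic scores standing in for the three retained components.
scores <- matrix(rnorm(300 * 3), ncol = 3,
                 dimnames = list(NULL, paste0("Comp", 1:3)))

# Assumed method: k-means with 4 centers and several random starts.
km <- kmeans(scores, centers = 4, nstart = 25)

# Cluster ids are arbitrary, so inspect the centers before naming anything.
round(km$centers, 2)

# Labels taken from the list above. On real data the id-to-label mapping
# has to be chosen by reading the centers; here it is only a placeholder.
Product <- factor(
  km$cluster,
  levels = 1:4,
  labels = c("active shopper", "wine lover", "gold lover", "inactive shopper")
)

table(Product)
```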

Variable Summary

  • 7 numeric variables
  • 3 categorical variables
##       Education        Income      NumWebPurchases  NumCatalogPurchases
##  Graduation:1074   Min.   : 1730   Min.   : 0.000   Min.   : 0.000     
##  Master    : 529   1st Qu.:34236   1st Qu.: 2.000   1st Qu.: 0.000     
##  PhD       : 440   Median :49118   Median : 3.000   Median : 1.000     
##                    Mean   :49705   Mean   : 3.968   Mean   : 2.363     
##                    3rd Qu.:65526   3rd Qu.: 6.000   3rd Qu.: 4.000     
##                    Max.   :94384   Max.   :11.000   Max.   :10.000     
##  NumStorePurchases      Age         Seniority       Children        Living    
##  Min.   : 0.000    Min.   :25.0   Min.   :2621   Min.   :0.000   couple:1332  
##  1st Qu.: 3.000    1st Qu.:44.0   1st Qu.:2802   1st Qu.:1.000   single: 711  
##  Median : 5.000    Median :51.0   Median :2977   Median :1.000                
##  Mean   : 5.681    Mean   :51.9   Mean   :2975   Mean   :1.001                
##  3rd Qu.: 8.000    3rd Qu.:61.0   3rd Qu.:3150   3rd Qu.:1.000                
##  Max.   :13.000    Max.   :76.0   Max.   :3320   Max.   :3.000                
##              Product    
##  active shopper  : 259  
##  gold lover      : 194  
##  inactive shopper:1253  
##  wine lover      : 337  
##                         
## 
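The summary above contains `Age`, `Seniority`, `Children` and `Living`, which do not appear in the raw data. A plausible derivation is sketched below on a few toy rows. The reference date, the use of days for `Seniority`, the sum of `Kidhome` and `Teenhome`, and the grouping of marital statuses are all assumptions, not details confirmed by the source.

```r
# A few toy rows using the raw column names and formats shown earlier.
raw <- data.frame(
  Year_Birth     = c(1970, 1961, 1989),
  Marital_Status = c("Divorced", "Married", "Together"),
  Kidhome        = c(0, 1, 1),
  Teenhome       = c(0, 1, 0),
  Dt_Customer    = c("6/16/14", "6/15/14", "5/13/14"),
  stringsAsFactors = FALSE
)

# Hypothetical reference date for age and seniority.
ref_date <- as.Date("2021-07-01")

# Dt_Customer is stored as month/day/two-digit-year text.
enrolled <- as.Date(raw$Dt_Customer, format = "%m/%d/%y")

derived <- data.frame(
  # Age in years at the reference date.
  Age = as.integer(format(ref_date, "%Y")) - raw$Year_Birth,
  # Days since enrolment (assumed unit).
  Seniority = as.numeric(difftime(ref_date, enrolled, units = "days")),
  # Kids and teenagers at home combined (assumed definition).
  Children = raw$Kidhome + raw$Teenhome,
  # Two-level living arrangement (assumed grouping).
  Living = factor(
    ifelse(raw$Marital_Status %in% c("Married", "Together"), "couple", "single"),
    levels = c("couple", "single")
  )
)

derived
```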

Hierarchical Clustering

##              3 clusters 4 clusters
## Dunn's Index     0.1389     0.1476
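Dunn's index is the smallest distance between points in different clusters divided by the largest distance between points in the same cluster, so larger values indicate more compact, better separated clusters. The sketch below computes it in base R for 3 and 4 clusters cut from one tree. The helper `dunn_index`, the synthetic data, Euclidean distance and Ward linkage are all assumptions; with mixed numeric and categorical variables, a dissimilarity such as Gower (for example from `cluster::daisy`) would take the place of `dist()`.

```r
# Dunn's index from a dist object and a vector of cluster ids.
# Hypothetical helper written for this example.
dunn_index <- function(d, cl) {
  m <- as.matrix(d)
  same <- outer(cl, cl, "==")  # TRUE where both points share a cluster
  min(m[!same]) / max(m[same])
}

set.seed(7)

# Synthetic two-dimensional data with three loose groups.
x <- rbind(
  matrix(rnorm(60, mean = 0), ncol = 2),
  matrix(rnorm(60, mean = 4), ncol = 2),
  matrix(rnorm(60, mean = 8), ncol = 2)
)

# Hierarchical clustering on standardized data (assumed distance and linkage).
d <- dist(scale(x))
tree <- hclust(d, method = "ward.D2")

# Compare the index for the two candidate cuts.
dunn <- sapply(
  c("3 clusters" = 3, "4 clusters" = 4),
  function(k) dunn_index(d, cutree(tree, k = k))
)
dunn
```

In the table above the 4-cluster cut scores slightly higher (0.1476 against 0.1389), which favors four clusters on this criterion.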

Dendrogram

Cluster Interpretation

Categorical variables

Numeric variables

Features and Clusters